Fast k-Nearest Neighbour Search via Dynamic Continuous Indexing
Authors
Abstract
Existing methods for retrieving k-nearest neighbours suffer from the curse of dimensionality. We argue this is caused in part by inherent deficiencies of space partitioning, which is the underlying strategy used by almost all existing methods. We devise a new strategy that avoids partitioning the vector space and present a novel randomized algorithm that runs in time linear in dimensionality and sub-linear in the size of the dataset, and takes space constant in dimensionality and linear in the size of the dataset. The proposed algorithm allows fine-grained control over accuracy and speed on a per-query basis, automatically adapts to variations in dataset density, supports dynamic updates to the dataset and is easy to implement. We show appealing theoretical properties and demonstrate empirically that the proposed algorithm outperforms locality-sensitive hashing (LSH) in terms of approximation quality and speed.
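The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea of searching without partitioning the space, not the authors' method. It assumes a simplified scheme: project every point onto a few random unit directions, keep each set of projections sorted, and at query time collect the points whose projections lie closest to the query's projection before ranking those candidates by true distance. The names `build_index`, `query`, `num_directions` and `num_visit` are hypothetical; `num_visit` stands in for the per-query accuracy/speed control mentioned above.

```python
import numpy as np


def build_index(data, num_directions, rng):
    """Project data onto random unit directions and sort each projection.

    Simplified illustration only; not the algorithm from the paper.
    """
    dirs = rng.standard_normal((num_directions, data.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = data @ dirs.T                      # shape (n_points, num_directions)
    order = np.argsort(proj, axis=0)          # point ids sorted per direction
    sorted_proj = np.take_along_axis(proj, order, axis=0)
    return dirs, order, sorted_proj


def query(data, index, q, k, num_visit):
    """Return ids of up to k approximate nearest neighbours of q.

    num_visit (hypothetical parameter) bounds how many points are examined
    on each side of the query along every direction: larger values trade
    speed for accuracy.
    """
    dirs, order, sorted_proj = index
    q_proj = dirs @ q
    n = data.shape[0]
    candidates = set()
    for j in range(dirs.shape[0]):
        # Binary search for the query's position in the sorted projections.
        pos = np.searchsorted(sorted_proj[:, j], q_proj[j])
        lo = max(0, pos - num_visit)
        hi = min(n, pos + num_visit)
        candidates.update(order[lo:hi, j].tolist())
    cand = np.fromiter(candidates, dtype=np.int64)
    # Rank the candidate set by true Euclidean distance.
    dists = np.linalg.norm(data[cand] - q, axis=1)
    return cand[np.argsort(dists)[:k]]


rng = np.random.default_rng(0)
data = rng.standard_normal((200, 16))
index = build_index(data, num_directions=5, rng=rng)
neighbours = query(data, index, data[3], k=5, num_visit=10)
```

Setting `num_visit` to the dataset size makes every point a candidate, so the result matches exhaustive search; smaller values examine fewer points per direction. A sorted array is used here for brevity, so this sketch does not support the efficient dynamic updates the abstract describes.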
Similar resources
Fast k-Nearest Neighbour Search via Prioritized DCI
Most exact methods for k-nearest neighbour search suffer from the curse of dimensionality; that is, their query times exhibit exponential dependence on either the ambient or the intrinsic dimensionality. Dynamic Continuous Indexing (DCI) (Li & Malik, 2016) offers a promising way of circumventing the curse by avoiding space partitioning and achieves a query time that grows sublinearly in the int...
Shape Indexing Using Approximate Nearest-Neighbour Search in High-Dimensional Spaces
Shape indexing is a way of making rapid associations between features detected in an image and object models that could have produced them. When model databases are large, the use of high-dimensional features is critical, due to the improved level of discrimination they can provide. Unfortunately, finding the nearest neighbour to a query point rapidly becomes inefficient as the dimensionality o...
Extending LAESA Fast Nearest Neighbour Algorithm to Find the k Nearest Neighbours
Many pattern recognition tasks make use of the k-nearest neighbour (k-NN) technique. In this paper we are interested in fast k-NN search algorithms that can work in any metric space, i.e. they are not restricted to Euclidean-like distance functions. Only the symmetry and triangle inequality properties are required of the distance. A large set of such fast k-NN search algorithms have been develope...
Some improvements on NN based classifiers in metric spaces
The nearest neighbour (NN) and k-nearest neighbour (k-NN) classification rules have been widely used in pattern recognition due to their simplicity and good behaviour. Exhaustive nearest neighbour search may become impractical when facing large training sets, high-dimensional data or expensive dissimilarity measures (distances). In recent years a lot of fast NN search algorithms have been d...
Extending Fast Nearest Neighbour Search Algorithms for Approximate k-NN Classification
The nearest neighbour (NN) and k-nearest neighbour (kNN) classification rules have been widely used in pattern recognition due to their simplicity and good behaviour. Exhaustive nearest neighbour search can become impractical when facing large training sets, high-dimensional data or expensive similarity measures. In recent years a lot of NN search algorithms have been developed to overcome those...